class: center, middle, inverse, title-slide

.title[
# Data Manipulation & Visualization
]
.subtitle[
## An Introduction to the {tidyverse} with {refugees} & {unhcrthemes}
]
.date[
###
]

---

# Learning objectives

- An introduction to the RStudio development environment
- Basics of the grammar of data manipulation with [**`dplyr`**](https://dplyr.tidyverse.org/) & [**`tidyr`**](https://tidyr.tidyverse.org/), using the [**`{refugees}`**](https://www.unhcr.org/refugee-statistics/insights/explainers/refugees-r-package.html) package
- Basics of the grammar of graphics with [**`ggplot2`**](https://ggplot2.tidyverse.org/index.html)
- Get branded visuals with the [**`{unhcrthemes}`**](https://unhcr-dataviz.github.io/unhcrthemes/) package

---

# Learning stages...

.pull-left[

]

.pull-right[

__Step 1.__ Develop an understanding of what data science is and what concepts are needed for it

__Step 2.__ Break data science challenges into small steps

- Acquire basic command syntax through a very practical and focused project
- Data manipulation & visualization

__Step 3.__ Develop a reproducible analysis workflow with functions and report templates

- Understand the relevance, inputs, constraints, and limitations of the various techniques

__Step 4.__ Optimize your problem-solving approaches in elegant ways

- Build packages & Shiny apps

]

???

https://towardsdatascience.com/the-stages-of-learning-data-science-3cc8be181f54

See Video - https://www.youtube.com/watch?v=hpMc6TgT34I

---
class: inverse, center, middle

# Using RStudio

---

## Integrated Development Environment (IDE)

.pull-left[

- Environment
- History
- Files
- Plots
- Packages
- Help
- Source
- Console/Terminal

]

.pull-right[

]

---

## RStudio - Source

.pull-left[

Where you write your R code or document content

If you are writing an R Markdown document, you can render it in this area

To run R code, you can use the shortcut Ctrl + Enter

]

.pull-right[

]

---

## RStudio - Console/Terminal

.pull-left[

Where the code is evaluated and where you can check the execution of commands

You can also perform quick calculations (that you do not need to save)

]

.pull-right[

]

---

## RStudio - Environment/History

.pull-left[

Where you can see the objects in your workspace

You can also view your command history (History tab)

]

.pull-right[

]

---

## RStudio - Files/Plots/Packages/Help

.pull-left[

Browse the file directories

View plots

View which R packages are installed. You can also update the packages

Read the R help

]

.pull-right[

]

---
class: inverse, center, middle

# Manipulate data on {refugees} with {dplyr} & {tidyr}

---

### Install the {tidyverse}, a collection of opinionated packages!

Go to your [locally installed RStudio](https://www.rstudio.com/products/rstudio/download/#download) or [sign up for a free RStudio Cloud account](https://login.rstudio.cloud/register?redirect=https%3A%2F%2Fclient.login.rstudio.cloud%2Foauth%2Flogin%3Fshow_auth%3D0%26show_login%3D0)

.pull-left[

First create a new project within RStudio, then make sure the [tidyverse](https://www.tidyverse.org/packages/) is installed:

```r
# Tidyverse: install only if it cannot be loaded
if (!require("tidyverse")) install.packages("tidyverse", dependencies = TRUE)
# here: builds file paths relative to the project root
if (!require("here")) install.packages("here")
```

]

.pull-right[

]

???

The tidyverse is an opinionated collection of R packages designed for data science.
All packages share an underlying design philosophy, grammar, and data structures.

---

## The {refugees} Data Package

A package can be made of both functions and data. {refugees} is an easy-to-use interface to the datasets normally served through the API. These cover forcibly displaced populations, including refugees, asylum-seekers, internally displaced people, stateless people, and others, over a span of more than 70 years.

This package provides data from three major sources:

.pull-left[

Data from UNHCR’s annual statistical activities dating back to 1951.

Data from the United Nations Relief and Works Agency for Palestine Refugees in the Near East (UNRWA), specifically for registered Palestine refugees under UNRWA’s mandate.

Data from the Internal Displacement Monitoring Centre (IDMC) on people displaced within their country due to conflict or violence.

]

.pull-right[

```r
## 2 options to install
# From CRAN
# install.packages("refugees")
# From GitHub
# pak::pkg_install("PopulationStatistics/refugees")

## Once installed, load the package!
library(refugees)
```

]

---

## 8 sub-datasets in the package - updated every 6 months

1. `population`: Data on forcibly displaced and stateless persons by year, including refugees, asylum-seekers, internally displaced people (IDPs) and stateless people. Detailed definitions of the different population groups can be found on the methodology page of the Refugee Data Finder.
2. `idmc`: Data from the Internal Displacement Monitoring Centre on the total number of IDPs displaced due to conflict and violence.
3. `asylum_applications`: Data on asylum applications, including the procedure type and application type.
4. `asylum_decisions`: Data on asylum decisions, including recognitions, rejections, and administrative closures.
5. `demographics`: Demographic and sub-national data, where available, including disaggregation by age and sex.
6. `solutions`: Data on durable solutions for refugees and IDPs.
7. 
`unrwa`: Data on registered Palestine refugees under UNRWA’s mandate.
8. `flows`: Number of people forced to flee during each year since 1962. For more information, see the explanation of the forced displacement flow dataset.

---

## Data Manipulation

filter, group_by, summarise, slice

```r
library(refugees)
library(dplyr)
library(tidyr)

# Top 10 countries of origin in 2022, counting refugees plus
# other people in need of international protection (oip)
ref_coo_10 <- refugees::population |>
  dplyr::filter(year == 2022) |>
  dplyr::group_by(coo_name) |>
  dplyr::summarise(refugees = sum(refugees, na.rm = TRUE) +
                     sum(oip, na.rm = TRUE)) |>
  dplyr::slice_max(order_by = refugees, n = 10)
```

---

## Data Manipulation

left_join

```r
# Yearly totals from the population table, joined by year
# with the yearly IDMC and UNRWA totals
fd_last_ten_years <- refugees::population |>
  dplyr::filter(year >= 2022 - 9 & year < 2023) |>
  dplyr::summarise(refugees = sum(refugees, na.rm = TRUE),
                   asylum_seekers = sum(asylum_seekers, na.rm = TRUE),
                   oip = sum(oip, na.rm = TRUE),
                   .by = year) |>
  dplyr::left_join(refugees::idmc |>
                     filter(year >= 2022 - 9) |>
                     summarise(idmc = sum(total, na.rm = TRUE), .by = year),
                   by = c("year")) |>
  dplyr::left_join(refugees::unrwa |>
                     filter(year >= 2022 - 9) |>
                     summarise(unrwa = sum(total, na.rm = TRUE), .by = year),
                   by = c("year"))
```

---

## Data Manipulation

pivot_longer

```r
# One row per year and population type, with readable labels
fd_last_ten_years <- fd_last_ten_years |>
  tidyr::pivot_longer(cols = -year,
                      names_to = "population_type",
                      values_to = "total") |>
  dplyr::mutate(population_type = factor(population_type,
                                         levels = c("oip", "unrwa", "asylum_seekers",
                                                    "refugees", "idmc")),
                population_type = dplyr::recode(population_type,
                                                refugees = "Refugees under UNHCR’s mandate",
                                                asylum_seekers = "Asylum-seekers",
                                                oip = "Other people in need of international protection",
                                                idmc = "Internally displaced persons",
                                                unrwa = "Palestine refugees under UNRWA’s mandate")) |>
  dplyr::arrange(year, population_type)
```

---

## Data Manipulation

sum with na.rm

```r
# Totals by sex and age group for 2022; na.rm = TRUE skips missing values
demo_2022 <- refugees::demographics |>
  filter(year == 2022 &
           pop_type %in% c("REF", "ASY", "IDP", "OIP", "RDP", "RET", "STA", "OOC")) |>
  summarise("male 0-17" = sum(m_0_4, na.rm = TRUE) + sum(m_5_11,
na.rm = TRUE) + sum(m_12_17, na.rm = TRUE),
            "male 18-59" = sum(m_18_59, na.rm = TRUE),
            "male 60+" = sum(m_60, na.rm = TRUE),
            "female 0-17" = sum(f_0_4, na.rm = TRUE) + sum(f_5_11, na.rm = TRUE) + sum(f_12_17, na.rm = TRUE),
            "female 18-59" = sum(f_18_59, na.rm = TRUE),
            "female 60+" = sum(f_60, na.rm = TRUE)) |>
  pivot_longer(cols = everything(),
               names_sep = " ",
               names_to = c(".value", "ages")) |>
  mutate(male_p = round(male / (sum(female) + sum(male)), 2),
         female_p = round(female / (sum(female) + sum(male)), 2))
```

---
class: inverse, center, middle

# Visualise with {ggplot2} and {unhcrthemes}

### Introduction

---

## The `ggplot2` package

.pull-left[

- **`ggplot2`** is an R package for declaratively creating graphics
- **`ggplot2`** is an implementation of [The Grammar of Graphics](https://link.springer.com/chapter/10.1007/978-3-642-21551-3_13) by Leland Wilkinson
- **The idea:** don't start with the final form of the graphic (the Excel approach) but **decompose the graphic** into its constituents

]

.pull-right[
.center[]
]

???

You provide the data, tell ggplot2 how to map variables to aesthetics and what graphical primitives to use, and it takes care of the details.

What does it take to create a graphic? Data, axes, geometric objects, etc.

---

## Structure of `ggplot2`

How a **`ggplot2`** graph is built on the grammar of graphics elements:

<table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> </tbody> </table>

???

1. Data - without data, you don't have a plot!
2. Mapping - linking variables to graphical properties.
3. 
Geometries - interpret aesthetics as graphical representations. 4. Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? --- ## Structure of `ggplot2` How a **`ggplot2`** graph is built on the grammar of graphics elements: <table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Aesthetics </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td> <td style="text-align:left;"> Aesthetics mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td> </tr> </tbody> </table> ??? 1. Data - without data, you don't have a plot! 2. Mapping - linking variables to graphical properties. 3. Geometries - interpret aesthetics as graphical representations. 4. Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? 
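Item 2 in the notes above, mapping, can be illustrated with a minimal sketch. It assumes the built-in `mtcars` data set as a stand-in for the workshop data, and it peeks ahead at `geom_point()` only so that something is drawn. Inside `aes()` a variable drives the colour; outside `aes()` the colour is set to one constant value.

```r
library(ggplot2)

# Mapping: colour varies with a variable, so it goes inside aes()
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# Setting: one constant colour for every point, so it goes outside aes()
p_set <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "#0072BC")
```

Printing `p_mapped` draws a legend for `cyl`, while `p_set` has none because no variable is mapped to colour.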
--- ## Structure of `ggplot2` How a **`ggplot2`** graph is built on the grammar of graphics elements: <table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Aesthetics </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td> <td style="text-align:left;"> Aesthetics mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Geometries </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> geom_*() </td> <td style="text-align:left;"> The geometric shapes that will represent the data. </td> </tr> </tbody> </table> ??? 1. Data - without data, you don't have a plot! 2. Mapping - linking variables to graphical properties. 3. Geometries - interpret aesthetics as graphical representations. 4. Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? 
--- ## Structure of `ggplot2` How a **`ggplot2`** graph is built on the grammar of graphics elements: <table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Aesthetics </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td> <td style="text-align:left;"> Aesthetics mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Geometries </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> geom_*() </td> <td style="text-align:left;"> The geometric shapes that will represent the data. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Statistical transformations </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> stat_*() </td> <td style="text-align:left;"> Statistical summaries of the data, such as quantiles, fitted curves, and sums. </td> </tr> </tbody> </table> ??? 1. Data - without data, you don't have a plot! 2. Mapping - linking variables to graphical properties. 3. Geometries - interpret aesthetics as graphical representations. 4. Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? 
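Item 4 in the notes above, statistics, can be sketched as follows. The `toy` data frame and its values are made up for illustration. `stat_summary()` computes the yearly sum inside the plot, while the second version sums the data beforehand and hands the result to `geom_col()`.

```r
library(ggplot2)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  year = c(2021, 2021, 2022, 2022),
  num  = c(10, 5, 20, 15)
)

# The statistic adds up num within each year before the bars are drawn
p_stat <- ggplot(toy, aes(x = year, y = num)) +
  stat_summary(fun = sum, geom = "col")

# Same totals, computed before plotting
toy_sum <- aggregate(num ~ year, data = toy, FUN = sum)
p_pre <- ggplot(toy_sum, aes(x = year, y = num)) +
  geom_col()
```

Summarising beforehand keeps the totals available as a table; letting the statistic do the work keeps the plotting code shorter. Which one fits depends on whether the numbers are needed elsewhere.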
--- ## Structure of `ggplot2` How a **`ggplot2`** graph is built on the grammar of graphics elements: <table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Aesthetics </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td> <td style="text-align:left;"> Aesthetics mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Geometries </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> geom_*() </td> <td style="text-align:left;"> The geometric shapes that will represent the data. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Statistical transformations </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> stat_*() </td> <td style="text-align:left;"> Statistical summaries of the data, such as quantiles, fitted curves, and sums. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Scales </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> scale_*() </td> <td style="text-align:left;"> Maps between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors. </td> </tr> </tbody> </table> ??? 1. Data - without data, you don't have a plot! 2. Mapping - linking variables to graphical properties. 3. Geometries - interpret aesthetics as graphical representations. 4. 
Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? --- ## Structure of `ggplot2` How a **`ggplot2`** graph is built on the grammar of graphics elements: <table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Aesthetics </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td> <td style="text-align:left;"> Aesthetics mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Geometries </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> geom_*() </td> <td style="text-align:left;"> The geometric shapes that will represent the data. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Statistical transformations </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> stat_*() </td> <td style="text-align:left;"> Statistical summaries of the data, such as quantiles, fitted curves, and sums. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Scales </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> scale_*() </td> <td style="text-align:left;"> Maps between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors. 
</td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Coordinate System </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> coord_*() </td> <td style="text-align:left;"> The transformation used for mapping data coordinates into the plane of the data rectangle. </td> </tr> </tbody> </table> ??? 1. Data - without data, you don't have a plot! 2. Mapping - linking variables to graphical properties. 3. Geometries - interpret aesthetics as graphical representations. 4. Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? --- ## Structure of `ggplot2` How a **`ggplot2`** graph is built on the grammar of graphics elements: <table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Aesthetics </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td> <td style="text-align:left;"> Aesthetics mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Geometries </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> geom_*() </td> <td style="text-align:left;"> The geometric shapes that will represent the data. 
</td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Statistical transformations </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> stat_*() </td> <td style="text-align:left;"> Statistical summaries of the data, such as quantiles, fitted curves, and sums. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Scales </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> scale_*() </td> <td style="text-align:left;"> Maps between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Coordinate System </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> coord_*() </td> <td style="text-align:left;"> The transformation used for mapping data coordinates into the plane of the data rectangle. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Facets </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> facet_*() </td> <td style="text-align:left;"> The arrangement of the data into a grid of plots. </td> </tr> </tbody> </table> ??? 1. Data - without data, you don't have a plot! 2. Mapping - linking variables to graphical properties. 3. Geometries - interpret aesthetics as graphical representations. 4. Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? 
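The eight elements listed above can be put together in one hedged sketch. It assumes the built-in `mtcars` data set rather than the workshop data, and the particular geom, statistic, scale, and theme are arbitrary picks for illustration, not the ones used later in this deck.

```r
library(ggplot2)

p <- ggplot(data = mtcars,                                   # 1. Data
            aes(x = wt, y = mpg, colour = factor(cyl))) +    # 2. Mapping
  geom_point() +                                             # 3. Geometry
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # 4. Statistic
  scale_colour_brewer(palette = "Dark2") +                   # 5. Scale
  coord_cartesian(ylim = c(10, 35)) +                        # 6. Coordinates
  facet_wrap(vars(am)) +                                     # 7. Facets
  theme_minimal()                                            # 8. Theme

p
```

Each `+` adds one element, so any line after the first two can be removed or swapped without touching the others.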
--- ## Structure of `ggplot2` How a **`ggplot2`** graph is built on the grammar of graphics elements: <table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;"> <caption style="font-size: initial !important;">Credit: Cedric Scherer</caption> <thead> <tr> <th style="text-align:left;"> Layer </th> <th style="text-align:left;"> Function </th> <th style="text-align:left;"> Explanation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Data </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td> <td style="text-align:left;"> The raw data that you want to plot. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Aesthetics </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td> <td style="text-align:left;"> Aesthetics mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Geometries </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> geom_*() </td> <td style="text-align:left;"> The geometric shapes that will represent the data. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Statistical transformations </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> stat_*() </td> <td style="text-align:left;"> Statistical summaries of the data, such as quantiles, fitted curves, and sums. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Scales </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> scale_*() </td> <td style="text-align:left;"> Maps between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors. 
</td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Coordinate System </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> coord_*() </td> <td style="text-align:left;"> The transformation used for mapping data coordinates into the plane of the data rectangle. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Facets </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> facet_*() </td> <td style="text-align:left;"> The arrangement of the data into a grid of plots. </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Visual Themes </td> <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> theme_*() </td> <td style="text-align:left;"> The overall visual defaults of a plot, such as background, grids, axes, default typeface, sizes and colors. </td> </tr> </tbody> </table> ??? 1. Data - without data, you don't have a plot! 2. Mapping - linking variables to graphical properties. 3. Geometries - interpret aesthetics as graphical representations. 4. Statistics - compute/transform numbers for us. 5. Scales - interpret values in data to graphical properties. 6. Coordinates - define physical mapping. 7. Facets - split plot into panels. 8. Theme - what does your plot look like? --- ## {unhcrthemes} package .pull-left[ 1. **Branded** `ggplot2` theme ] .pull-right[ <img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-16-1.png" width="2100" /> ] --- ## {unhcrthemes} package .pull-left[ 1. **Branded** `ggplot2` theme 2. 
A series of color palettes:
   - A **categorical palette** for UNHCR main data visualization colors
   - A **categorical palette** for people of concern to UNHCR categories
   - A **categorical palette** for geographical regional divisions of UNHCR
   - Six **sequential color palettes** for all the main data visualization colors
   - Two recommended **diverging color palettes**

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-17-1.png" width="2100" />
]

---

## {unhcrthemes} package

.pull-left[

1. **Branded** `ggplot2` theme
2. A series of color palettes:
   - A **categorical palette** for UNHCR main data visualization colors
   - A **categorical palette** for people of concern to UNHCR categories
   - A **categorical palette** for geographical regional divisions of UNHCR
   - Six **sequential color palettes** for all the main data visualization colors
   - Two recommended **diverging color palettes**
3. Available on [GitHub](https://github.com/unhcr-dataviz/unhcrthemes/), with a dedicated [documentation page](https://unhcr-dataviz.github.io/unhcrthemes/index.html), and used throughout the [examples of the data visualization platform](https://dataviz.unhcr.org/tools/r/).

]

.pull-right[
.center[<img src="https://raw.githubusercontent.com/unhcr-dataviz/unhcrthemes/master/man/figures/logo.svg" alt="unhcrthemes HEX" style="max-width:60%">]
]

---
class: inverse, center, middle

# {ggplot2} and {unhcrthemes}

### In action

---

## Replicate an existing chart!

.pull-left[

Replicate a chart example from the [Global Trends 2022](https://www.unhcr.org/globaltrends.html) webpage using `ggplot2`, and make it brand compliant with the `unhcrthemes` package.
]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-18-1.png" width="2100" />
]

---

## Setup

.pull-left[

```r
# Install packages
# if needed uncomment lines below
# install.packages('tidyverse')
# remotes::install_github("unhcr-dataviz/unhcrthemes")

# Load packages
library(tidyverse)
library(unhcrthemes)

# Load data
displ <- fd_last_ten_years |>
  dplyr::rename(num = total,
                pop = population_type)

# Check data structure
#View(displ)
```

]

.pull-right[

<table>
<thead> <tr> <th style="text-align:right;"> Year </th> <th style="text-align:left;"> Population type </th> <th style="text-align:right;"> # of people </th> </tr> </thead>
<tbody>
<tr> <td style="text-align:right;"> 2013 </td> <td style="text-align:left;"> Other people in need of international protection </td> <td style="text-align:right;"> 0 </td> </tr>
<tr> <td style="text-align:right;"> 2013 </td> <td style="text-align:left;"> Palestine refugees under UNRWA’s mandate </td> <td style="text-align:right;"> 5030049 </td> </tr>
<tr> <td style="text-align:right;"> 2013 </td> <td style="text-align:left;"> Asylum-seekers </td> <td style="text-align:right;"> 1162934 </td> </tr>
<tr> <td style="text-align:right;"> 2013 </td> <td style="text-align:left;"> Refugees under UNHCR’s mandate </td> <td style="text-align:right;"> 11698233 </td> </tr>
<tr> <td style="text-align:right;"> 2013 </td> <td style="text-align:left;"> Internally displaced persons </td> <td style="text-align:right;"> 33340830 </td> </tr>
<tr> <td style="text-align:right;"> 2014 </td> <td style="text-align:left;"> Other people in need of international protection </td> <td style="text-align:right;"> 0 </td> </tr>
</tbody>
</table>

]

---

## Data

.pull-left[

```r
*ggplot(data = displ)
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-22-1.png" width="2100" />
]

???

1. Data - without data, you don't have a plot!
But nothing happens here because we haven't mapped the raw data to anything, so we just get an empty canvas.

---

## Aesthetics

.pull-left[

```r
ggplot(data = displ,
*      aes(x = year, y = num))
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-24-1.png" width="2100" />
]

???

2. Mapping - linking variables to graphical properties.

We have now mapped the year to the x axis and the number of displaced people to the y axis, but we still don't see anything except the axis values.

---

## Geoms

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num)) +
* geom_col()
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-26-1.png" width="2100" />
]

---

## Scale

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num)) +
  geom_col() +
* scale_x_continuous(
*   breaks = scales::pretty_breaks(n = 10))
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-28-1.png" width="2100" />
]

---

## Scale

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
* scale_y_continuous(
*   labels = scales::label_number_si(),
*   expand = expansion(c(0, 0.1)))
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-30-1.png" width="2100" />
]

---

## Context

.pull-left[

Before playing with `unhcrthemes`, let's add some information on the chart.
```r
ggplot(data = displ,
       aes(x = year, y = num)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
* labs(title = "People forced to flee worldwide | 2012-2022",
*      caption = "Source: UNHCR Refugee Data Finder")
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-32-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
* theme_unhcr()
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-34-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
* theme_unhcr(grid = "Y",
*             axis_title = FALSE)
```

]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-36-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num)) +
  geom_col(
*   color = unhcr_pal(n = 1, name = "pal_blue")
  ) +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
  theme_unhcr(grid = "Y",
              axis_title = FALSE)
```

]

--

.pull-right[
<img
src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-38-1.png" width="2100" />
]

???

Is color the right property? Also notice that we haven't mapped the color to anything; we're just setting it.

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num)) +
  geom_col(
*   fill = unhcr_pal(n = 1, name = "pal_blue")
  ) +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
  theme_unhcr(grid = "Y",
              axis_title = FALSE)
```
]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-40-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num,
*          fill = pop)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
  theme_unhcr(grid = "Y",
              axis_title = FALSE)
```
]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-42-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num,
           fill = pop)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
* scale_fill_unhcr_d() +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
  theme_unhcr(grid = "Y",
              axis_title = FALSE)
```
]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-44-1.png"
width="2100" />
]

---

## unhcrthemes

.pull-left[
**Recommended colours from dataviz guideline**

]

.pull-right[
**Check available colours in the package**

```r
display_unhcr_all()
```

<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-45-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num,
           fill = pop)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
  scale_fill_unhcr_d(
*   palette = "pal_unhcr_poc"
  ) +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
  theme_unhcr(grid = "Y",
              axis_title = FALSE)
```
]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-47-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(data = displ,
       aes(x = year, y = num,
           fill = pop)) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))) +
  scale_fill_unhcr_d(
    palette = "pal_unhcr_poc",
*   nmax = 9, order = c(4, 1:3, 9, 8)
  ) +
  labs(title = "People forced to flee worldwide | 2012-2022",
       caption = "Source: UNHCR Refugee Data Finder") +
  theme_unhcr(grid = "Y",
              axis_title = FALSE)
```
]

.pull-right[
<img src="data:image/png;base64,#02.Tidyverse_files/figure-html/unnamed-chunk-49-1.png" width="2100" />
]

---
class: inverse, center, middle

# Thank you

### Questions?
[Post feedback here](https://github.com/unhcRverse/unhcrverse/issues/new?assignees=&labels=enhancement&projects=&template=comment_prex_2_tidyverse.md&title=%5Blearn%5D)

<a href="index.html"><i class="fa fa-indent fa-fw fa-2x"></i></a>

---

# Resources

- [R Essential Training: Wrangling and Visualizing Data](https://wd3.myworkday.com/unhcr/learning/course/046437bef6c810195cefc58c829f0006?type=9882927d138b100019b928e75843018d)
- [learn R tidyverse](https://wd3.myworkday.com/unhcr/learning/course/046437bef6c810195cfa6ae8c4f30003?type=9882927d138b100019b928e75843018d)
- [R for Excel users](https://wd3.myworkday.com/unhcr/learning/course/9c1ac0ad65dd1001b5838b7d40590001?type=9882927d138b100019b928e75843018d)
- [Data Visualization in R with ggplot2](https://wd3.myworkday.com/unhcr/learning/course/046437bef6c810195cfbd3773f3f0004?type=9882927d138b100019b928e75843018d)
- [Creating Maps with R](https://wd3.myworkday.com/unhcr/learning/course/046437bef6c810195cefd839751a0006?type=9882927d138b100019b928e75843018d)
- [ggplot2 main documentation](https://ggplot2.tidyverse.org/index.html)
- [The ggplot flipbook](https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html#1) by Gina Reynolds
- [A ggplot2 tutorial for beautiful plotting in R](https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/) and [ggplot Wizardry Hands-On](https://z3tt.github.io/OutlierConf2021/) by Cedric Scherer
- ggplot2 workshop [Part 1](https://www.youtube.com/watch?v=h29g21z0a68) / [Part 2](https://www.youtube.com/watch?v=0m4yywqNPVY) by Thomas Lin Pedersen (one of the main maintainers of ggplot2)
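
---

## Try it yourself

The layered build used in the ggplot2 slides can be rehearsed without downloading any data. What follows is a minimal sketch, assuming only `ggplot2` and `scales` are installed. The `displ_toy` data frame is hypothetical: it borrows the column names `year`, `num` and `pop` from the slides, but its values are made up and are not UNHCR figures. `scales::label_comma()` and `theme_minimal()` are stand-ins chosen for this sketch, not the helpers used in the deck.

```r
library(ggplot2)

# Hypothetical toy data: same column names as `displ` in the slides,
# but the values are invented for illustration only.
displ_toy <- data.frame(
  year = rep(2020:2022, each = 2),
  pop  = rep(c("Group A", "Group B"), times = 3),
  num  = c(100, 50, 120, 60, 150, 80)
)

# Build the chart one layer at a time, as in the slides.
p <- ggplot(data = displ_toy,
            aes(x = year, y = num, fill = pop)) +   # data + aesthetic mapping
  geom_col() +                                      # geometry: stacked columns
  scale_x_continuous(breaks = 2020:2022) +          # one tick per year
  scale_y_continuous(
    labels = scales::label_comma(),                 # formatted y labels
    expand = expansion(mult = c(0, 0.1))) +         # no gap under the bars
  labs(title = "Toy example | 2020-2022",
       caption = "Source: made-up values") +        # context
  theme_minimal()                                   # stand-in theme

print(p)
```

If `unhcrthemes` is loaded, swapping `theme_minimal()` for `theme_unhcr(grid = "Y", axis_title = FALSE)` and adding `scale_fill_unhcr_d()` should give the branded look shown in the earlier slides.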